Abstract

Baranyai's theorem is a well-known theorem in the theory of hypergraphs. A
corollary of this theorem says that one can partition the family of all $u$-subsets of
an $n$-element set into $\binom{n-1}{u-1}$ sub-families such that each sub-family forms a
partition of the $n$-element set, where $n$ is divisible by $u$. In this paper, we present a
coding-theoretic application of Baranyai's theorem (or equivalently, the corollary). More
precisely, we propose the first purely combinatorial construction of locally decodable
codes. Locally decodable codes are error-correcting codes that allow the recovery of
any message bit by looking at only a few bits of the codeword. Such codes have
attracted a lot of attention in recent years. We stress that our construction does not
improve the parameters of known constructions. What makes it interesting is the
underlying combinatorial techniques and their potential in future applications.

1 Introduction 

1.1 Baranyai's Theorem 

Let $\Omega = \{\omega_1, \ldots, \omega_n\}$ be a finite set of cardinality $n$. A hypergraph on
$\Omega$ is a family $H = \{E_1, \ldots, E_m\}$ of nonempty subsets of $\Omega$ such that
$\bigcup_{j=1}^{m} E_j = \Omega$. The $n$ elements $\omega_1, \ldots, \omega_n$ are called vertices of $H$
and the $m$ subsets $E_1, \ldots, E_m$ are called edges of $H$. For every vertex
$\omega \in \Omega$, the star with center $\omega$ is denoted by $H(\omega)$ and defined to be the
set of all edges of $H$ that contain $\omega$, i.e., $H(\omega) = \{E \in H : \omega \in E\}$. The degree
of $\omega$ is denoted by $d_H(\omega)$ and defined to be $|H(\omega)|$. The hypergraph $H$ is said to be
$r$-regular if $d_H(\omega) = r$ for every $\omega \in \Omega$ and $u$-uniform if $|E| = u$ for every
$E \in H$. As an alternative, the hypergraph $H$ can also be defined by its incidence matrix
$A = (a_{ij})_{n \times m}$, where the rows of $A$ are labeled by the $n$ vertices
$\omega_1, \ldots, \omega_n$, the columns of $A$ are labeled by the $m$ edges $E_1, \ldots, E_m$, and

$$a_{ij} = \begin{cases} 1, & \text{if } \omega_i \in E_j, \\ 0, & \text{otherwise,} \end{cases} \tag{1}$$

for every $i \in [n] = \{1, 2, \ldots, n\}$ and $j \in [m]$. It is straightforward to see that every
row of the incidence matrix of an $r$-regular hypergraph contains exactly $r$ 1's and every
column of the incidence matrix of a $u$-uniform hypergraph contains exactly $u$ 1's. For
every $J \subseteq [m]$, the partial hypergraph of $H$ generated by $J$ is defined to be the sub-family
$H' = \{E_j : j \in J\}$ of $H$, where the vertex set of $H'$ is $\Omega' = \bigcup_{j \in J} E_j$. It is also
easy to see that the incidence matrix $A'$ of $H'$ is in fact a submatrix of $A$ whose rows are
labeled by elements in $\Omega'$ and columns are labeled by elements in $H'$. Let $u \in [n]$. The
hypergraph $H$ is called a $u$-complete hypergraph of order $n$ and denoted by $K_n^u$ if it
consists of all $u$-subsets of $\Omega$. In particular, we have that $m = \binom{n}{u}$ when $H = K_n^u$.
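
As an illustration of these definitions, the following Python sketch builds the incidence matrix of a complete hypergraph and checks the row and column counts. It is not part of the original text; the function name `incidence_matrix` and the lexicographic ordering of the edges are our own assumptions. The check relies on the standard fact that $K_n^u$ is $u$-uniform and $\binom{n-1}{u-1}$-regular.

```python
from itertools import combinations
from math import comb


def incidence_matrix(n, u):
    """Incidence matrix of the u-complete hypergraph on vertices 1..n.

    Rows are indexed by vertices and columns by the u-subsets in
    lexicographic order (an assumed ordering); entry (i, j) is 1 iff
    vertex i+1 lies in the j-th edge.
    """
    edges = list(combinations(range(1, n + 1), u))
    matrix = [[1 if v in e else 0 for e in edges] for v in range(1, n + 1)]
    return edges, matrix


edges, A = incidence_matrix(6, 2)
row_sums = [sum(row) for row in A]                       # vertex degrees
col_sums = [sum(A[i][j] for i in range(6)) for j in range(len(edges))]
print(len(edges) == comb(6, 2))                          # number of edges
print(all(r == comb(5, 1) for r in row_sums))            # regularity
print(all(c == 2 for c in col_sums))                     # uniformity
```

All three checks hold for $n = 6$ and $u = 2$, the parameters used in Example 1.1.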

The study of complete hypergraphs has been an interesting problem in the theory
of hypergraphs (see Section 5 of Chapter 4 in Berge [2]). In particular, Baranyai made an
in-depth study of the edge colorings (see page 137 of [2] for the definition) of complete
hypergraphs and obtained the following theorem (Theorem 11, page 143 of [2]):

Theorem 1.1 (Baranyai 1975) Let $n$ and $u$ be integers such that $n \ge u \ge 2$. Let
$m_1, \ldots, m_k$ be $k$ positive integers such that $m_1 + \cdots + m_k = \binom{n}{u}$. Then the
complete hypergraph $K_n^u$ can be divided into $k$ partial hypergraphs $H_1, \ldots, H_k$ such that

• $|H_i| = m_i$ for every $i \in [k]$;

• $|H_i \cap H_j| = 0$ whenever $i, j \in [k]$ and $i \ne j$; and

• $\lfloor u m_i / n \rfloor \le d_{H_i}(\omega) \le \lceil u m_i / n \rceil$ for every $i \in [k]$ and $\omega \in \Omega$.

In particular, when $u \mid n$, one can set $k = \binom{n-1}{u-1}$ and $m_1 = \cdots = m_k = n/u$ and
have the following corollary:

Corollary 1.1 (Baranyai 1975) If $u \mid n$, then the complete hypergraph $K_n^u$ can be divided
into $k = \binom{n-1}{u-1}$ partial hypergraphs $H_1, \ldots, H_k$ such that

• $H_i$ is a 1-regular hypergraph of order $n$ and has $n/u$ edges for every $i \in [k]$; and

• $|H_i \cap H_j| = 0$ whenever $i, j \in [k]$ and $i \ne j$.

In fact, Corollary 1.1 says that the family of all $u$-subsets of an $n$-element set $\Omega$ can
be partitioned into $\binom{n-1}{u-1}$ sub-families such that each sub-family forms a partition of
the set $\Omega$.

Example 1.1 Let $n = 6$ and $u = 2$. The incidence matrix $A$ of $K_6^2$ can be depicted by
Figure 1, where the rows and columns of $A$ are labeled by elements of $\Omega = \{1, 2, 3, 4, 5, 6\}$
and all 2-subsets of $\Omega$, respectively. We can divide $K_6^2$ into 5 partial hypergraphs:
$H_1 = \{12, 34, 56\}$, $H_2 = \{13, 25, 46\}$, $H_3 = \{14, 26, 35\}$, $H_4 = \{15, 24, 36\}$ and
$H_5 = \{16, 23, 45\}$ such that both conditions in Corollary 1.1 are satisfied. We also highlight
the incidence matrix of $H_1$ in Figure 1.

Figure 1: Incidence matrix of $K_6^2$
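
The two conditions of Corollary 1.1 can be checked mechanically for the five sub-families listed in Example 1.1. The sketch below is only an illustration; the helper names `is_partition_of_vertices` and `is_baranyai_partition` are hypothetical and do not come from the paper.

```python
from itertools import combinations

# The five partial hypergraphs from Example 1.1, written as lists of 2-subsets.
H = [
    [(1, 2), (3, 4), (5, 6)],
    [(1, 3), (2, 5), (4, 6)],
    [(1, 4), (2, 6), (3, 5)],
    [(1, 5), (2, 4), (3, 6)],
    [(1, 6), (2, 3), (4, 5)],
]


def is_partition_of_vertices(edges, n):
    """True iff the edges are pairwise disjoint and cover {1, ..., n}."""
    covered = sorted(v for e in edges for v in e)
    return covered == list(range(1, n + 1))


def is_baranyai_partition(parts, n, u):
    """Check both conditions of Corollary 1.1 for a candidate family.

    Every u-subset must occur in exactly one part, and every part must be
    a partition of the vertex set (i.e., a 1-regular partial hypergraph).
    """
    all_edges = sorted(e for part in parts for e in part)
    every_edge_once = all_edges == sorted(combinations(range(1, n + 1), u))
    return every_edge_once and all(is_partition_of_vertices(p, n) for p in parts)


print(is_baranyai_partition(H, 6, 2))
```

Dropping any one of the five parts makes the check fail, since the remaining parts no longer contain every 2-subset.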



Baranyai's theorem has found many interesting applications (for example, see Section 
6, Chapter 4 in [2]). In this paper, we present a new application of this theorem in the 
construction of locally decodable codes [8].

1.2 Locally Decodable Codes

Let $\mathbb{F}$ be a finite field. A classical error-correcting code [6] $C : \mathbb{F}^n \to \mathbb{F}^N$
allows one to encode any message $x = x_1 \cdots x_n$ as a codeword $C(x)$ such that the message
can be recovered even if $C(x)$ gets corrupted in a number of coordinates. However, to recover
even one symbol of the message, one has to consider all or most of the coordinates of the
codeword. When only a single symbol is needed, more efficient schemes are possible and they
are known as locally decodable codes. In such codes, a probabilistic decoder $D$ can recover
any particular symbol $x_i$ of the message with very good probability by looking at only a few
coordinates of $C(x)$, even if a constant fraction of $C(x)$ has been corrupted. For any
$y, z \in \mathbb{F}^N$, we denote by $\Delta(y, z)$ their Hamming distance, i.e., the number of
coordinates where they differ.

Definition 1.1 (Locally Decodable Code) A code $C : \mathbb{F}^n \to \mathbb{F}^N$ is said to be
$(p, \delta, \epsilon)$-locally decodable if there is a probabilistic decoder $D$ (which uses random
coins in decoding) such that

1. for every $x \in \mathbb{F}^n$, $i \in [n]$ and $y \in \mathbb{F}^N$ such that $\Delta(C(x), y) \le \delta N$, it holds
that $\Pr[D^y(i) = x_i] \ge 1 - \epsilon$, where the probability is taken over the random coins of $D$
and $D^y$ means that $D$ only looks at a number of coordinates of $y$;

2. $D$ looks at at most $p$ coordinates of the word $y$.

The quality of C is measured by its query complexity p and length N (both as a function 
of n). Ideally, one would like both p and N to be as small as possible. 

Example 1.2 (Walsh-Hadamard Code, page 249 or 382 of [1]) The best-known example of a locally
decodable code is the Walsh-Hadamard code $C : \{0,1\}^n \to \{0,1\}^{2^n}$ whose
generator matrix takes all vectors in $\{0,1\}^n$ as columns. For every message $x \in \{0,1\}^n$,
the coordinates of $C(x)$ are labeled by the vectors in $\{0,1\}^n$. In particular, the coordinate
labeled by $v \in \{0,1\}^n$ is equal to $\sum_{i=1}^{n} x_i v_i \bmod 2$. Given a word $y \in \{0,1\}^{2^n}$
such that $\Delta(C(x), y) \le \delta \cdot 2^n$, a decoder $D$ may recover a bit $x_i$ by looking at the two
bits of $y$ labeled by $v$ and $v + e_i$ for a uniformly random $v \in \{0,1\}^n$ and then outputting
their sum, where $e_i = (0, \ldots, 1, \ldots, 0) \in \{0,1\}^n$ is the $i$th unit vector. Clearly, each of
the two bits is corrupted with probability at most $\delta$ and thus the decoder outputs the correct
$x_i$ with probability at least $1 - 2\delta$. Hence, the Walsh-Hadamard code $C$ is a
$(2, \delta, 2\delta)$-locally decodable code that encodes $n$-bit messages as $2^n$-bit codewords
(i.e., $N = 2^n$).
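
The two-query decoder of Example 1.2 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the encoding of a label $v$ as an integer bit mask are our assumptions, not notation from [1].

```python
import random


def hadamard_encode(x):
    """Encode an n-bit message as the 2^n inner products <x, v> mod 2.

    The coordinate with index v (read as an n-bit mask, an assumed
    labeling) stores <x, v> mod 2, where bit t of the mask pairs with x[t].
    """
    n = len(x)
    x_mask = sum(bit << t for t, bit in enumerate(x))
    return [bin(x_mask & v).count("1") % 2 for v in range(2 ** n)]


def hadamard_decode_bit(y, n, i, rng=random):
    """Recover x[i] (0-based) from the two queries y[v] and y[v + e_i]."""
    v = rng.randrange(2 ** n)          # uniformly random label v
    return y[v] ^ y[v ^ (1 << i)]      # XOR with the mask 1 << i adds e_i


x = [1, 0, 1, 1]
y = hadamard_encode(x)
y[5] ^= 1  # corrupt one of the 16 coordinates
votes = [hadamard_decode_bit(y, 4, 2) for _ in range(1000)]
print(sum(votes) / len(votes))  # fraction of runs that output 1; x[2] is 1
```

With a single corrupted coordinate, a run can fail only when one of its two queries hits that coordinate, which matches the $1 - 2\delta$ bound in the example.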

Katz and Trevisan [5] were the first to formally define locally decodable codes. In
recent years, the construction of locally decodable codes has attracted a large amount of
attention HI [3]. Interested readers are referred to Yekhanin [8] for a good survey
of locally decodable codes.

1.3 Results 

While the works mentioned above focus on improving the parameters (i.e., $p$
and $N$) of locally decodable codes and require nice algebraic ideas, in this paper
we are interested in the connection between locally decodable codes and combinatorial
objects in discrete mathematics. In particular, we propose the first purely combinatorial
construction of locally decodable codes, which is based on the hypergraphs in Section 1.1.
More precisely, we show the following theorem:

Theorem 1.2 For any odd integer $p > 1$, there is a binary linear $(p, \delta, p^2\delta/(p-1))$-locally
decodable code that encodes $n$-bit messages as $2^{H(1/p)n}$-bit codewords, where
$H(s) = -s \log_2 s - (1-s) \log_2(1-s)$ is the binary entropy function.

2 The Combinatorial Construction 

In this section, we present our purely combinatorial construction of locally decodable codes
and prove Theorem 1.2. We first give a technical lemma and then present both the
encoding and decoding algorithms of our locally decodable codes.

2.1 A Technical Lemma 

Lemma 2.1 Let $C : \mathbb{F}^n \to \mathbb{F}^N$ be a linear code with generator matrix
$G = [G_1, \ldots, G_N]$, where $G_1, \ldots, G_N \in \mathbb{F}^n$ are the columns of $G$. Let $k$ be a
positive integer such that $k \mid \lambda N$, where $0 < \lambda < 1$ and $\lambda N$ is an integer. Suppose
there are $n$ subsets $T_1, \ldots, T_n \subseteq [N]$, each of cardinality $\lambda N$, such that

(i) for every $i \in [n]$, the set $T_i$ has a partition $T_i = T_{i1} \cup \cdots \cup T_{ik}$ such that
$|T_{ij}| = \lambda N/k$ for every $j \in [k]$; and

(ii) for every $i \in [n]$ and $j \in [k]$, the $i$th unit vector $e_i$ is a linear combination of
$\{G_\ell : \ell \in T_{ij}\}$.

Then the code $C$ is $(p, \delta, p\delta/\lambda)$-locally decodable, where $p = \lambda N/k$.

Proof: We need to provide a probabilistic decoder $D$ for the code $C$ such that both
requirements in Definition 1.1 are satisfied. Let $x \in \mathbb{F}^n$ be any message and let $C(x)$ be
its codeword. As required, the decoder $D$ is given access to the coordinates of a word
$y \in \mathbb{F}^N$ and asked to recover a particular symbol of the message $x$, say $x_i$, where
$i \in [n]$. Given the public knowledge of the subsets $T_1, \ldots, T_n$ and their partitions, our
decoder $D$ picks a random integer $j \in [k]$ and looks at the coordinates of $y$ labeled by the
elements of $T_{ij}$, i.e., $\{y_\ell : \ell \in T_{ij}\}$. Since the $i$th unit vector $e_i$ is a linear
combination of $\{G_\ell : \ell \in T_{ij}\}$, there are $|T_{ij}| = \lambda N/k$ field elements
$\{c_\ell : \ell \in T_{ij}\}$ such that $e_i = \sum_{\ell \in T_{ij}} c_\ell G_\ell$. Knowing the field elements
$\{c_\ell : \ell \in T_{ij}\}$, our decoder simply outputs $\sum_{\ell \in T_{ij}} c_\ell y_\ell$.

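Firstly, it is clear that our decoder looks at $\lambda N/k = p$ coordinates of $y$, i.e., its query
complexity is $p$. Secondly, we shall compute $\Pr[D^y(i) = x_i]$. If $y = C(x)$, then we have
that

$$\sum_{\ell \in T_{ij}} c_\ell y_\ell = \sum_{\ell \in T_{ij}} c_\ell \cdot \langle x, G_\ell \rangle = \Big\langle x, \sum_{\ell \in T_{ij}} c_\ell G_\ell \Big\rangle = \langle x, e_i \rangle = x_i, \tag{2}$$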

where $\langle \cdot, \cdot \rangle$ stands for the standard dot product. Clearly, equation (2) says that $D$
always succeeds in recovering $x_i$ when $y = C(x)$, i.e., $\Pr[D^{C(x)}(i) = x_i] = 1$. Now suppose
that $y \in \mathbb{F}^N$ is a word such that $\Delta(C(x), y) \le \delta N$ for a small constant $0 < \delta < 1$.
In this case, the decoder $D$ may not output $x_i$ correctly since some of the coordinates
$\{y_\ell : \ell \in T_{ij}\}$ may have been corrupted and consequently the left hand side of (2) may
not be equal to $x_i$. We say that $j \in [k]$ is bad if there is at least one $\ell \in T_{ij}$ such that
$y_\ell$ is not equal to the $\ell$th coordinate of $C(x)$; otherwise, $j$ is called good. Equation (2)
shows that $D$ correctly outputs $x_i$ when $j$ is good. Therefore, $\Pr[D^y(i) = x_i]$ is at least the
probability that the $j$ chosen by $D$ is good. However, since $\Delta(C(x), y) \le \delta N$, at most
$\delta N$ of the coordinates of $y$ are not consistent with $C(x)$. Since the sets
$T_{i1}, \ldots, T_{ik}$ are pairwise disjoint, each such coordinate makes at most one index bad, and
thus at most $\delta N$ of the indices $j \in [k]$ are bad. It follows that

$$\Pr[D^y(i) = x_i] \ge \Pr[j \text{ is good}] \ge 1 - \delta N/k = 1 - p\delta/\lambda,$$

which in turn implies that $C$ is a $(p, \delta, p\delta/\lambda)$-locally decodable code that encodes
messages of $n$ symbols as codewords of $N$ symbols. □



2.2 The Construction 

In this section, we present our combinatorial construction of locally decodable codes. Our
construction is based on the combinatorial objects called hypergraphs that are defined in
Section 1.1.

The locally decodable codes we construct here are linear codes over the binary field
$\mathbb{F}_2$. Let $p > 1$ be any odd positive integer and let $n = pu + 1$ for any positive integer $u$.
Let $K_n^u$ be the $u$-complete hypergraph over an $n$-element vertex set
$\Omega = \{\omega_1, \ldots, \omega_n\}$ and let $A = (a_{ij})_{n \times N} = [A_1, \ldots, A_N]$ be the incidence
matrix of $K_n^u$, where $N = \binom{n}{u}$ is the number of all $u$-subsets of $\Omega$ and $A_j$ is the
$j$th column of $A$ for every $j \in [N]$. Since the entries of $A$ are either 0 or 1, we can
consider the matrix $A$ over the binary field $\mathbb{F}_2$. We define
$G = (g_{ij})_{n \times N} = [G_1, \ldots, G_N]$ to be the binary matrix such that $g_{ij} = 1 + a_{ij}$ for
every $i \in [n]$ and $j \in [N]$, where $G_j$ is the $j$th column of $G$ for every $j \in [N]$. Our code
$C : \mathbb{F}_2^n \to \mathbb{F}_2^N$ has generator matrix $G$, i.e., any message $x \in \mathbb{F}_2^n$ will be
encoded as $C(x) = xG$.
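
The encoding step can be sketched as follows. This is an illustrative sketch rather than a definitive implementation: the function names `generator_matrix` and `encode`, and the lexicographic ordering of the $u$-subsets used to index the columns, are assumptions of ours.

```python
from itertools import combinations


def generator_matrix(p, u):
    """Generator matrix G = (1 + a_ij) over F_2 for n = p*u + 1.

    Columns are indexed by the u-subsets of {1, ..., n} in lexicographic
    order (an assumed ordering); g_ij = 1 exactly when vertex i+1 is NOT
    in the j-th subset.
    """
    n = p * u + 1
    subsets = list(combinations(range(1, n + 1), u))
    G = [[0 if v in s else 1 for s in subsets] for v in range(1, n + 1)]
    return subsets, G


def encode(x, G):
    """Compute the codeword xG over F_2 for a message x of n bits."""
    n, N = len(G), len(G[0])
    return [sum(x[i] * G[i][j] for i in range(n)) % 2 for j in range(N)]


subsets, G = generator_matrix(p=3, u=2)  # n = 7, N = 21
codeword = encode([1, 0, 0, 1, 0, 1, 1], G)
print(len(subsets), len(codeword))
```

Encoding the first unit vector simply returns the first row of $G$, which is a convenient sanity check for the indexing.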

We have to show that the code $C$ we constructed above is locally decodable. In fact,
this is a consequence of Corollary 1.1 and Lemma 2.1. Formally, we have that

Theorem 2.1 The code $C : \mathbb{F}_2^n \to \mathbb{F}_2^N$ is $(p, \delta, p\delta/\lambda)$-locally decodable, where
$\lambda = 1 - u/n$.

Proof: Let $k = \lambda N/p = \binom{n-1}{u}/p$. Due to Lemma 2.1, we only need to show that there
are $n$ subsets $T_1, \ldots, T_n \subseteq [N]$ such that both (i) and (ii) hold. For every $i \in [n]$, let

$$T_i = \{j \in [N] : g_{ij} = 1\}.$$

As with the incidence matrix $A$, we can consider the rows and columns of $G$ as labeled by the
$n$ elements of $\Omega$ and the $N$ $u$-subsets of $\Omega$, respectively. Due to the definition of $G$, it
is then straightforward to see that $T_i$ corresponds to the set of all $u$-subsets of
$\Omega \setminus \{\omega_i\}$ for every $i \in [n]$. Clearly, we have that
$|T_1| = \cdots = |T_n| = \binom{n-1}{u} = \lambda N$.

We consider the submatrix $G^{(i)}$ of $G$ that consists of all columns of $G$ labeled by $T_i$ (or
equivalently, by all $u$-subsets of $\Omega \setminus \{\omega_i\}$). Let $J$ be the all-one matrix of size
$(n-1) \times \binom{n-1}{u}$ and $B$ be the incidence matrix of the $u$-complete hypergraph of order
$n - 1$ (i.e., $K_{n-1}^u$) on $\Omega \setminus \{\omega_i\}$. Clearly, we have that

$$G^{(i)} = \begin{pmatrix} F_1 \\ \mathbf{1} \\ F_2 \end{pmatrix},$$

where $\mathbf{1}$ is the all-one row vector of dimension $\binom{n-1}{u}$, $F_1$ consists of the first
$i - 1$ rows of $J - B$ and $F_2$ consists of the last $n - i$ rows of $J - B$. Since $u \mid (n-1)$,
Corollary 1.1 implies that we can partition the set of columns of $G^{(i)}$ into
$k = \binom{n-2}{u-1}$ subsets, say $T_{i1}, \ldots, T_{ik} \subseteq T_i$, such that each $T_{ij}$ corresponds to
a partition of $\Omega \setminus \{\omega_i\}$ for every $j \in [k]$. It is clear that
$|T_{ij}| = (n-1)/u = \lambda N/k$ for every $j \in [k]$, which implies that (i) holds. On the other
hand, for every $j \in [k]$, we consider the submatrix $G^{(i,j)}$ of $G^{(i)}$ that consists of all
the columns of $G^{(i)}$ labeled by elements in $T_{ij}$. Clearly, there are $p$ subsets
$S_1, \ldots, S_p \subseteq \Omega \setminus \{\omega_i\}$ of cardinality $u$ corresponding to the indices in $T_{ij}$.
In particular, these subsets form a partition of $\Omega \setminus \{\omega_i\}$. We note that every
$\omega_{i'}$ with $i' \in [n] \setminus \{i\}$ appears in exactly one of the $p$ sets and therefore the $i'$th
row of $G^{(i,j)}$ contains exactly $p - 1$ 1's and one 0. On the other hand, the $i$th row of
$G^{(i,j)}$ is the all-one vector of dimension $p$. It follows that the sum of the columns of
$G^{(i,j)}$ is

$$\begin{pmatrix} (p-1) \cdot J_1 \\ p \\ (p-1) \cdot J_2 \end{pmatrix},$$

where $J_1$ is the all-one vector of dimension $i - 1$ and $J_2$ is the all-one vector of dimension
$n - i$. Clearly, this is the $i$th unit vector over the binary field since $p$ is an odd integer.
In other words, (ii) holds. Due to Lemma 2.1, our code $C$ must be $(p, \delta, p\delta/\lambda)$-locally
decodable. □
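
Conditions (i) and (ii) in the proof above can be checked numerically in the special case $u = 2$. The sketch below is ours and makes two assumptions that should be read as such: it only handles $u = 2$ (so $n = 2p + 1$), and it obtains the partition of $K_{n-1}^2$ from the classical round-robin (circle method) 1-factorization rather than from a general algorithm for Baranyai's theorem, which the paper does not specify. The function names are hypothetical.

```python
from math import comb


def round_robin(vertices):
    """1-factorization of the complete graph on an even number of vertices.

    Circle method: the last vertex is fixed and the others sit on a ring,
    giving len(vertices) - 1 perfect matchings that together contain every
    pair exactly once.
    """
    *ring, fixed = vertices
    m = len(ring)  # odd, because len(vertices) is even
    rounds = []
    for r in range(m):
        matching = [frozenset((fixed, ring[r]))]
        for t in range(1, (m + 1) // 2):
            matching.append(frozenset((ring[(r - t) % m], ring[(r + t) % m])))
        rounds.append(matching)
    return rounds


def check_conditions(p):
    """For u = 2 and n = 2p + 1, check (i) and (ii) for every i in [n]."""
    n = 2 * p + 1
    for i in range(1, n + 1):
        others = [v for v in range(1, n + 1) if v != i]
        blocks = round_robin(others)  # plays the role of T_i1, ..., T_ik
        # (i): the blocks are disjoint and cover all 2-subsets avoiding i.
        all_pairs = {s for block in blocks for s in block}
        if len(all_pairs) != comb(n - 1, 2) or any(len(b) != p for b in blocks):
            return False
        # (ii): the column of G labeled s has a 1 in row v iff v is not in s,
        # so the columns of each block must sum to e_i over F_2.
        for block in blocks:
            col_sum = [sum(v not in s for s in block) % 2 for v in range(1, n + 1)]
            if col_sum != [1 if v == i else 0 for v in range(1, n + 1)]:
                return False
    return True


print(check_conditions(3), check_conditions(5))
```

For even $p$ the column sums would vanish in row $i$ as well, which is why the construction requires $p$ to be odd.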

In our construction, the query complexity $p$ of the code $C$ should be a constant and
the quality of the code should be measured by the asymptotic length $N$ as a function of
$n$. By the standard entropy estimate for binomial coefficients, we have that

$$N = \binom{n}{u} = \binom{n}{(n-1)/p} \le 2^{H(1/p)n},$$

where $H(s) = -s \log_2 s - (1-s) \log_2(1-s)$ is the binary entropy function. It is clear that
our code is asymptotically more efficient than the Walsh-Hadamard code whenever $p > 1$
is an odd integer. For example, we have that $N < 2^{0.92n}$ when $p = 3$; and $N < 2^{0.44n}$
when $p = 11$, which is less than the square root of the length of the Walsh-Hadamard
code. On the other hand, we should also have an estimation of the probability that our
decoder $D$ gives an incorrect output, i.e., $\epsilon = p\delta/\lambda$. However, it is easy to see that
$\lambda = 1 - u/n = 1 - u/(pu+1) > 1 - 1/p$ and therefore $\epsilon < p^2\delta/(p-1)$, which is a constant
as long as $\delta$ is a constant. Hence, we have the following theorem:

Theorem 2.2 For every odd integer $p > 1$, there is a binary linear $(p, \delta, p^2\delta/(p-1))$-locally
decodable code that encodes $n$-bit messages as $2^{H(1/p)n}$-bit codewords.
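
The exponents quoted before Theorem 2.2 can be checked numerically. The following sketch is illustrative only (the function names are ours); it evaluates the binary entropy $H(1/p)$ and compares it with the exact exponent $\log_2 N / n$ for a sample value of $u$.

```python
from math import comb, log2


def binary_entropy(s):
    """H(s) = -s*log2(s) - (1-s)*log2(1-s) for 0 < s < 1."""
    return -s * log2(s) - (1 - s) * log2(1 - s)


def length_exponent(p, u):
    """Return log2(N)/n for N = C(n, u) with n = p*u + 1."""
    n = p * u + 1
    return log2(comb(n, u)) / n


for p in (3, 11):
    # H(1/p) bounds the exact exponent from above for odd p >= 3.
    print(p, round(binary_entropy(1 / p), 4), round(length_exponent(p, 200), 4))
```

The entropy values round to 0.9183 for $p = 3$ and 0.4395 for $p = 11$, consistent with the bounds $2^{0.92n}$ and $2^{0.44n}$ stated in the text.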

Example 2.1 Let $n = 7$ and $u = 2$. Then the generator matrix $G$ of our $(3, \delta, 4.5\delta)$-locally
decodable code $C$ can be depicted by Figure 2, where the rows and columns of $G$
are labeled by elements of $\Omega = \{1, 2, 3, 4, 5, 6, 7\}$ and all 2-subsets of $\Omega$, respectively. For
explanation, we also highlight the set $T_7$ and the matrix $G^{(7)}$ in Figure 2. As an example,
in order to recover the 7th message bit, our decoder $D$ may look at 3 coordinates of the
received word $y$ that are labeled by any one of the following subsets:

$T_{71} = \{12, 34, 56\}$, $T_{72} = \{13, 25, 46\}$, $T_{73} = \{14, 26, 35\}$,

$T_{74} = \{15, 24, 36\}$, $T_{75} = \{16, 23, 45\}$.

In fact, these sets correspond to the partition of $K_6^2$ we noted in Example 1.1.

Figure 2: Generator matrix of a 3-query locally decodable code
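
The decoding of the 7th message bit in Example 2.1 can be traced in code. The sketch below is an illustration under our own naming (`encode`, `decode_bit7`, and the lexicographic column order are assumptions); it encodes a message, corrupts one coordinate, and runs the decoder once for each of the five blocks.

```python
from itertools import combinations

N_VERTICES = 7
SUBSETS = list(combinations(range(1, N_VERTICES + 1), 2))  # 21 column labels
INDEX = {s: j for j, s in enumerate(SUBSETS)}

# Blocks T_71, ..., T_75 from Example 2.1 (2-subsets of {1, ..., 6}).
T7 = [
    [(1, 2), (3, 4), (5, 6)],
    [(1, 3), (2, 5), (4, 6)],
    [(1, 4), (2, 6), (3, 5)],
    [(1, 5), (2, 4), (3, 6)],
    [(1, 6), (2, 3), (4, 5)],
]


def encode(x):
    """Coordinate labeled by subset s is the sum of x_v over v not in s."""
    return [sum(x[v - 1] for v in range(1, N_VERTICES + 1) if v not in s) % 2
            for s in SUBSETS]


def decode_bit7(y, j):
    """Recover x_7 by summing the three coordinates of y in block T_7j."""
    return sum(y[INDEX[s]] for s in T7[j]) % 2


x = [1, 0, 1, 1, 0, 0, 1]
y = encode(x)
y[INDEX[(3, 4)]] ^= 1  # corrupt the coordinate labeled 34
print([decode_bit7(y, j) for j in range(5)])  # prints [0, 1, 1, 1, 1]
```

Only the block containing the corrupted coordinate returns the wrong value, so a uniformly random choice of block recovers $x_7 = 1$ in four out of five cases here.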

3 Concluding Remarks 

In this paper, we presented an application of Baranyai's theorem. In particular, we
constructed a $(p, \delta, p^2\delta/(p-1))$-locally decodable code that encodes $n$-bit messages as
$2^{H(1/p)n}$-bit codewords, where $p > 1$ is any odd integer. Our construction does not improve
the parameters of the known constructions to date. However, it is still interesting in the sense
that the underlying techniques are purely combinatorial while all the known constructions
heavily rely on algebraic techniques.